TOPOLOGY OF DIOPHANTINE SETS: 
REMARKS ON MAZUR'S CONJECTURES 

GUNTHER CORNELISSEN AND KARIM ZAHIDI 



Abstract. We show that Mazur's conjecture on the real topology of rational 
points on varieties implies that there is no diophantine model of the rational 
integers Z in the rational numbers Q, i.e., there is no diophantine set D in 
some cartesian power $Q^l$ such that there exist two binary relations S, P on D
whose graphs are diophantine in $Q^{3l}$ (via the inclusion $D^3 \subset Q^{3l}$), and such
that for two specific elements $d_0, d_1 \in D$ the structure $(D, S, P, d_0, d_1)$ is a
model for integer arithmetic $(Z, +, \cdot, 0, 1)$.

Using a construction of Pheidas, we give a counterexample to the analogue 
of Mazur's conjecture over a global function field, and prove that there is a 
diophantine model of the polynomial ring over a finite field in the ring of 
rational functions over a finite field. 



1. Introduction 

One of the main themes in model theory is to understand the structure of definable 
sets: given a first-order language L and an L-structure M, describe the L-definable 
subsets of $M^n$ for various $n \in Z_{>0}$. Here, a set $S \subseteq M^n$ is called L-definable if
there exists an L-formula $\psi(x)$ with free variables $x = (x_1, \dots, x_n)$ such that for any
$a \in M^n$, $a \in S \iff M \models \psi(a)$. A set is called existentially definable (respectively
positive existentially, or diophantine) if $\psi(x)$ can be taken to be $(\exists y)\,\phi(x, y)$, with $\phi$
quantifier-free (respectively, quantifier- and negation-free, or atomic).

The natural geometric examples of such structures arise as in the following def- 
inition: 

Definition 1.1. If R is a commutative ring with unit, it admits a natural
interpretation for any first order language L of the form $L_R = (+, \cdot, =, c_i)$, where
the $c_i$ are primary predicates ("constants"), fewer in number than $|R|$. We call $L_R$ a
ring language. We define $L_Z = (+, \cdot, 0, 1)$ and $L_t = (+, \cdot, 0, 1, t)$ for any $t \in R$.

Example 1.2. (Tarski, cf. [?], pp. 202-206) (a) An algebraically closed field k
admits elimination of quantifiers in the language $L_Z$. Hence any $L_Z$-definable subset
of $k^n$ is a boolean combination of sets defined by an equation. Thus, the definable
sets for an algebraically closed field are exactly the classical sets of algebraic
geometry. One deduces, for example, that the only definable subsets of k are finite or
cofinite, a fact which at first sight seems not so obvious.

(b) The field of real numbers R admits elimination of quantifiers in the language
$L_> = (0, 1, +, \cdot, >)$ of ordered fields. Hence every definable set in $R^n$ is a boolean
combination of semi-algebraic sets (i.e., solution sets to systems of equations of the
form $f(x) = 0 \wedge g(x) > 0$). This gives a nice description of the definable subsets of
R: they are finite unions of intervals.

(c) More examples in the same vein exist, e.g., a description of definable sets
over p-adic fields ([?], [?]), or generalizations of $(R, L_>)$ via o-minimal expansions.

(d) To give an example with a different language, existentially definable sets of Z
in the language $(0, 1, +, |)$ are unions of arithmetic progressions (a result of Lipshitz
[?]).

1991 Mathematics Subject Classification. 03D35, 14G05.

The first author is Post-doctoral fellow of the Fund for Scientific Research - Flanders (FWO).

The moral is that if the (existentially) definable sets for such M have a suf- 
ficiently easy description, then the first-order (respectively, existential) theory of 
M is decidable - this is the case in the above examples. Conversely, if definable 
sets are combinatorially complicated, one expects the corresponding theory to be 
undecidable. 

Example 1.3. (a) Consider the rational integers $(Z, L_Z)$. It is impossible to
describe the $L_Z$-definable sets of Z in terms of "classical" sets (e.g., finite sets,
arithmetic progressions, ...). Eventually, this leads to the undecidability of the full
theory of $(Z, L_Z)$.

(b) The celebrated theorem of Davis, Matijasevich, Putnam and Robinson
describes the existentially definable sets of Z: they are exactly the recursively
enumerable sets, whose complexity far exceeds that of decidable (hence, certainly,
of computable) sets, and the undecidability of the existential theory of $(Z, L_Z)$
follows.

This maxim, the interplay between (un)decidability and definable sets, applies
in particular to the field $(Q, L_Z)$ of rational numbers. The field structure of Q
admits the same kind of "wild" definable sets as the integers; this follows from
J. Robinson's theorem that Z is a definable subset of Q ([21], theorem 3.1). The
question whether the same can happen if we restrict to the existentially definable
sets is still open.

In the next paragraph, we will present a conjecture by Mazur, which - although
it does not characterize the existentially definable sets of Q - poses severe
restrictions on their real topological structure. In the subsequent section, we prove that
this conjecture implies there is no "diophantine model" (cf. infra) of $(Z, L_Z)$ in
$(Q, L_Z)$; this generalizes Mazur's observation that his conjecture implies that Z
is not an $L_Z$-diophantine subset of Q. In particular, any proof of the diophantine
undecidability of Q "along traditional lines" fails if Mazur's conjecture is true. In
the final paragraph, we comment upon a non-archimedean version of this
conjecture. Though most of these observations are folklore, they do not seem to have been
written down previously.

2. Mazur's conjectures 

In [14], [15] and [16], Barry Mazur has proposed and discussed several conjectures
and questions about the behaviour of the set of Q-rational points of a variety over
Q under taking topological closure w.r.t. some metric induced by a valuation on
Q. The conjecture that we will concentrate upon (the weakest) is the following:

Conjecture 2.1. (Mazur [?], Conjecture 3) For any variety V over Q, the (real)
topological closure of V(Q) in V(R) has only a finite number of real topological
components.

There is some evidence for this conjecture, especially for such V which possess
special geometric properties (mostly related to the canonical class of V), and no
counterexample to it is known. Also observe that, with Q replaced by R in 2.1,
the "conjecture" says that a real variety has only finitely many real connected
components. This holds true; it could be deduced from Tarski's results, and there is
even an explicit bound on the Betti numbers of V(R), the so-called Milnor-Thom
theorem (cf. [?]).



Example 2.2. (a) Conjecture 2.1 is true for curves V. One can assume V to
be projective and non-singular. The case where V has genus $g \geq 2$ is settled by
Faltings's theorem, which says that V(Q) is a finite set ([?]). If V has genus 0,
then either V(Q) is empty, or V is Q-birational to $A^1$, and $A^1(Q)$ is topologically
dense in $A^1(R)$. Finally, assume that V has genus 1. It is known that V(R) is
isomorphic to the "circle group" R/Z or to $R/Z \times Z/2$ (see [?], V). Every proper
closed subgroup of the circle group is finite (see [?], theorem 1.34). Hence, if V(Q)
is not finite, then it is dense in every component of V(R) that it intersects.

(b) To provide a higher dimensional example, let V be a variety satisfying weak
approximation (i.e., such that $V(Q) \hookrightarrow \prod_p V(Q_p)$ is dense). Then the conjecture
holds true for V. This holds, e.g., if V is a smooth complete intersection of two
quadrics in projective space of dimension at least 5 (cf. [?]).

Remark 2.3. Mazur has made even stronger conjectures, some of which had to be
slightly modified, due to the construction of a counterexample by Colliot-Thélène,
Skorobogatov and Swinnerton-Dyer [?]. For an extensive (unsurpassable) exposition
and more examples, we refer to the original sources [14], [15] and [16]. We will
concentrate on the model-theoretical aspects of the conjectures, which are already
present in 2.1, but let the reader be warned about making too bold generalizations
of 2.1. A non-archimedean version will be considered in the last paragraph of this
paper.

Remark 2.4. The $(Q, L_Z)$-existentially definable subsets, in the sense of the
introduction, are precisely images of projections from V(Q) to affine space $A^n_Q$ for
various V and n.

A more model-theoretic conjecture would be that the real topological closure of a
$(Q, L_Z)$-existentially definable set is an $(R, L_>)$-definable set (i.e., a semi-algebraic
set). This implies 2.1, since a semi-algebraic set has only finitely many components.
We do not know whether conjecture 2.1 is equivalent to this statement. Note
that J. Robinson's argument (in [21]) shows that it is wrong when the word
"existentially" is erased.

3. Diophantine models of Z in Q



Mazur has observed that conjecture 2.1 implies that Z is not diophantine in Q
in the language $L_Z$: indeed, if Z arose as the projection of V(Q) for some
variety V, then since Z has infinitely many real components and the projection is
continuous, the same would hold for the real topological closure of V(Q) in V(R).

However, many proofs of the undecidability of the diophantine theory of
structures $(R, L_R)$ as in (1.1) do not give that Z is a diophantine subset of R, but rather
produce a diophantine model of $(Z, L_Z)$ in $(R, L_R)$, in the sense of the following
definition:

Definition 3.1. A model $(M, L, \phi)$ is a triple consisting of a first order language
L which consists of i-ary predicates $\{P_{i,\alpha}\}$, a set M and an interpretation $\phi$ of L
in M (we will often leave $\phi$ out of the notation). Note that any cartesian power
$M^k$ ($k \geq 1$) is likewise a model for L via "diagonal interpretation".

We say that a model $(M', L' = \{P'_{i,\alpha}\}, \phi')$ admits a diophantine model in
$(M, L, \phi)$ if there exists a set-theoretical bijection between $M'$ and a subset of
some cartesian power $M^k$ ($k \geq 1$), such that the image is diophantine, and such
that the induced inclusions $\phi'(P'_{i,\alpha}) \subseteq M^{ik}$ are diophantine.

A similar notion of (positive) existential model exists.

Example 3.2. (a) If $(M^2, L)$ admits a diophantine model in (M, L), then the latter
structure is said to admit diophantine storing (cf. [?]). This is true, for example, for
$(Z, L_Z)$. For non-algebraically closed rings $(R, L_R)$ admitting diophantine storing,
one can always choose k = 1 in the above definition. For if $(M', L', \phi')$ admits
a diophantine model in $(R^2, L_R)$ and $(R, L_R)$ admits diophantine storing, then
$(M', L', \phi')$ admits a diophantine model in $(R, L_R)$ (since conjunctions of diophantine
formulas are again diophantine if the quotient field of R is not algebraically closed,
cf. [?], §3).

(b) Typically, diophantine models of the integers $(Z, L_Z)$ in ring languages
$(R, L_R)$ arise in the following way: a commutative algebraic group G (e.g., the
multiplicative group of a quadratic ring, or an elliptic curve) is assumed to have
rank one over R, and the set Z has a diophantine model as the R-rational points
G(R) of G. The relation "addition" is automatically mapped to a diophantine subset
of $G^3(R)$, since the group law on G is a morphism. The most problematic point
is defining the relation "multiplication". For an example, consider the proof that
$(Z, L_Z)$ admits a diophantine model in $(R := S[t], L_t)$ for any commutative unitary
domain S of characteristic zero, see Denef [?]. He takes for G the torus $G_m$
of discriminant $\Delta = t^2 - 1$, which is non-split over R; G(R) has rank one: any
R-point is given by a solution $(x_n, y_n)$ to the Pell-equation $X^2 - \Delta Y^2 = 1$ (i.e.,
a power $u^n = x_n + y_n\sqrt{\Delta}$ of the fundamental unit $u = t + \sqrt{\Delta}$). Multiplication
$(x_r, y_r) \cdot (x_s, y_s) = (x_n, y_n)$ is defined by saying that $f := y_n - y_r \cdot y_s$ has a zero at
t = 1, i.e., $(\exists h \in R)(f = (t - 1)h)$.

(c) It is not known whether, if the ring R contains Z, the set Z itself is a
diophantine subset of R whenever $(Z, L_Z)$ admits a diophantine model in $(R, L_Z)$.
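
The Pell-equation encoding of Example 3.2(b) can be checked by hand in small cases. The following is a minimal sketch, assuming S = Z so that R = Z[t]; polynomials are stored as integer coefficient lists, and all function names are ours (hypothetical), not Denef's. It uses the recurrence obtained from $u^{n+1} = u^n \cdot (t + \sqrt{\Delta})$ and the observation that, at t = 1, this recurrence gives $y_n(1) = n$, so that $y_n - y_r y_s$ vanishes at t = 1 exactly when n = rs.

```python
# Polynomials in Z[t] as coefficient lists: f[i] is the coefficient of t^i.

def poly_add(f, g):
    n = max(len(f), len(g))
    return [(f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0)
            for i in range(n)]

def poly_sub(f, g):
    return poly_add(f, [-c for c in g])

def poly_mul(f, g):
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def evaluate(f, t0):
    return sum(c * t0**i for i, c in enumerate(f))

T = [0, 1]            # the polynomial t
DELTA = [-1, 0, 1]    # the discriminant t^2 - 1

def pell_solution(n):
    """Return (x_n, y_n) with u^n = x_n + y_n*sqrt(DELTA), u = t + sqrt(DELTA), n >= 0."""
    x, y = [1], [0]
    for _ in range(n):
        # (x + y*sqrt(D)) * (t + sqrt(D)) = (t*x + D*y) + (x + t*y)*sqrt(D)
        x, y = (poly_add(poly_mul(T, x), poly_mul(DELTA, y)),
                poly_add(x, poly_mul(T, y)))
    return x, y

def satisfies_pell(x, y):
    """Check the polynomial identity x^2 - DELTA*y^2 = 1."""
    lhs = poly_sub(poly_mul(x, x), poly_mul(DELTA, poly_mul(y, y)))
    return lhs[0] == 1 and not any(lhs[1:])

def encodes_product(r, s, n):
    """True iff y_n - y_r*y_s has a zero at t = 1 (i.e., is divisible by t - 1)."""
    y_r, y_s, y_n = pell_solution(r)[1], pell_solution(s)[1], pell_solution(n)[1]
    return evaluate(poly_sub(y_n, poly_mul(y_r, y_s)), 1) == 0
```

Since t - 1 is monic, divisibility by t - 1 in Z[t] is equivalent to vanishing at t = 1, which is why the sketch only evaluates at 1 instead of performing a polynomial division.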

The following result formalizes the technique of proof of many undecidability 
results: 



Observation 3.3. Assume that R is as in 1.1, and there is a polynomial whose
coefficients belong to $\phi(L)$ but that has no zero in the fraction field of R. If $(M', L')$
has an undecidable diophantine theory, and admits a diophantine model in $(R, L_R)$,
then the diophantine theory of $(R, L_R)$ is undecidable. □



Remark 3.4. Without any restrictions on R, if $(M', L')$ has an undecidable
(positive) existential theory and admits a (positive) existential model in $(R, L_R)$, then
the (positive) existential theory of $(R, L_R)$ is undecidable.

The technique of many undecidability proofs for rings $(R, L_R)$ is based on the
fact that, via a construction as in (3.2(b)), one can find a diophantine model of the
integers $(Z, L_Z)$ in $(R, L_R)$, and then rely on the fact that the diophantine theory
of the integers is undecidable ([?], [?]). It has been suggested that, with this more
flexible definition, one would be able to find a diophantine model of the integers in
the rationals:

Question 3.5. Does $(Z, L_Z)$ admit a diophantine model in $(Q, L_Z)$?

However, even this is impossible if we assume Mazur's conjecture:

Theorem 3.6. Mazur's conjecture 2.1 implies that there is no diophantine model
of $(Z, L_Z)$ in $(Q, L_Z)$.

Proof. Assume that there is such a diophantine model $(D, L_D)$, with $D \subseteq Q^k$.
Then there is an affine variety V over Q admitting a finite morphism $f : V_Q \to A^k_Q$
defined over Q such that $f(V(Q)) = D$.

If D is discrete (i.e., infinite and totally disconnected in the real topology),
the traditional proof applies: the real topological closure of V(Q) in V(R) is also
mapped to D by f, and hence it has infinitely many components.

If D is not discrete (which seems to be the case for the typical infinite diophantine
set in Q, say, the set of squares), then we show that one can select (in a computable
way) a discrete subset $\tilde{D}$ of D. Then the above proof, applied to $\tilde{D}$, gives the result.

Here are the details of the construction of $\tilde{D}$. We only have to treat the case
where the real topological closure $\bar{V}$ of V(Q) has only finitely many connected
components. Since f is continuous, the mean value theorem implies that $f(\bar{V})$
is the union of finitely many closed subsets in $R^k$. In particular, the topological
closure $\bar{D}$ of D contains finitely many closed subsets, and since D is infinite, one of
these subsets, say, $D_0$, is not a point. By composing f with a suitable Q-rational
projection $\pi : A^k_Q \to A^1_Q$ which does not map $D_0$ to a point, we may assume
k = 1. By composing with a fractional linear transformation defined over Q, we
may assume $\pi(D_0)$ to be the unit interval I = [0, 1]. Let $d_n$ be the element of D
corresponding to $n \in Z$. Let us consider the set

$$\tilde{Z} = \{ n \in Z \mid \tfrac{1}{2j+1} \leq \pi(d_n) \leq \tfrac{1}{2j} \text{ for some } j \in Z_{>0} \}.$$

The set $\tilde{Z}$ is Turing computable (since $D = \{\pi(d_n)\}$ is a listable subset of Q, it is
easy to write a Turing program to check the inequalities), hence it is recursively
enumerable (by Kleene's normal form theorem, cf. [?], 2.3-2.4), so by [?], it is
diophantine in $(Z, L_Z)$. Also, $\tilde{Z}$ is infinite, since $\pi(D) \cap I$ is dense in I. We now set

$$\tilde{D} = \{ d_n \mid n \in \tilde{Z} \}.$$

By construction, the set $\tilde{D}$ is diophantine in $(D, L_D)$, and hence a fortiori in
$(Q, L_Z)$. So there exists a variety $\tilde{V}$ over Q and a Q-morphism $\tilde{f} : \tilde{V} \to A^k_Q$
such that $\tilde{f}(\tilde{V}(Q)) = \tilde{D}$. However, the real closure of the set $\tilde{D}$ has infinitely many
connected components in the real topology by construction. Hence the same holds
for the real closure of $\tilde{V}(Q)$, contradicting Mazur's conjecture. □
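
The selection step in the proof is an explicit test on rational numbers, which the following minimal sketch illustrates. It assumes that the values $\pi(d_n)$ are available as exact fractions; the finite table passed to `select` is a hypothetical stand-in for an actual listing of D, and the function names are ours.

```python
from fractions import Fraction

def band_index(x):
    """Return j >= 1 with 1/(2j+1) <= x <= 1/(2j), or None if no such j exists."""
    if x <= 0:
        return None
    # floor(1/x) for a positive fraction x = numerator/denominator
    j = (x.denominator // x.numerator) // 2
    if j >= 1 and Fraction(1, 2 * j + 1) <= x <= Fraction(1, 2 * j):
        return j
    return None

def select(values):
    """Given a table n -> pi(d_n), return the sorted list of indices n that are kept."""
    return sorted(n for n, x in values.items() if band_index(x) is not None)
```

The retained values lie in the bands [1/3, 1/2], [1/5, 1/4], [1/7, 1/6], ..., which are separated by the open gaps (1/4, 1/3), (1/6, 1/5), ...; this separation is what forces infinitely many connected components in the real closure.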

4. Non-archimedean aspects of Mazur's conjectures

In ([?], II.2), Mazur has devised a conjecture of the above type which applies to
any completion of a number field, not just an archimedean one. As it makes sense
for any global field, we formulate it as follows:

Question 4.1. Let V be a variety over a global field K, v a valuation on K, and $K_v$
the completion of K w.r.t. v. For every point $p \in V(K_v)$, let W(p) be the Zariski
closure of $\bigcup (V(K) \cap U)$, where U ranges over all v-open neighbourhoods of p in
$V(K_v)$. Is the set $\{W(p) : p \in V(K_v)\}$ finite?

In our next theorem, we observe that the answer to this question is negative in 
positive characteristic: 

Theorem 4.2. Let $K = F_q(t)$ be the rational function field over a finite field $F_q$ of
positive characteristic p > 0, and let v be the valuation corresponding to the place
$t^{-1}$ of K (i.e., $v(a) = q^{\deg(a)}$ for $a \in F_q[t]$). Then there is a variety V over K for
which the answer to question 4.1 is negative.

Proof. In [19] (lemma 1) Pheidas proves that, for p > 2, projection onto the
x-coordinate of the K-rational points of the space curve $V_p$ given by

$$V_p : x - t = u^p - u, \quad x^{-1} - t^{-1} = v^p - v$$

gives the set $D_p = \{ t^{p^s} \mid s \in Z_{>0} \}$. For p = 2, Videla ([?]) proved that the set $D_2$
is the projection onto the x-coordinate of

$$V_2 : x + t = u^2 + u, \quad u = w^2 + t, \quad x^{-1} + t^{-1} = v^2 + v, \quad v = s^2 + t^{-1}.$$

Already the sets W(p) for $p \in V(K)$ are disjoint, since their x-coordinates are
separated ($v(t^{p^r} - t^{p^s}) > 1$ for all $r \neq s$). This gives a negative answer to question
4.1. □
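
One inclusion in Pheidas's lemma, namely that every $x = t^{p^s}$ is the x-coordinate of a point on $V_p$, follows from a telescoping sum in characteristic p and can be verified mechanically. The sketch below is only an illustration under our own conventions: Laurent polynomials over the prime field $F_p$ (p odd) are stored as dictionaries from exponents to coefficients, and the witnesses $u = \sum_{i<s} t^{p^i}$ and $v = \sum_{i<s} t^{-p^i}$ are our choice of solution, not taken from [19]. The converse inclusion is the substantial part of the lemma and is not addressed here.

```python
# Laurent polynomials over F_p as dicts {exponent: coefficient}, zero terms dropped.

def add(f, g, p):
    out = dict(f)
    for e, c in g.items():
        out[e] = (out.get(e, 0) + c) % p
    return {e: c for e, c in out.items() if c}

def neg(f, p):
    return {e: (-c) % p for e, c in f.items() if c % p}

def mul(f, g, p):
    out = {}
    for e1, c1 in f.items():
        for e2, c2 in g.items():
            out[e1 + e2] = (out.get(e1 + e2, 0) + c1 * c2) % p
    return {e: c for e, c in out.items() if c}

def power(f, n, p):
    result = {0: 1}
    for _ in range(n):
        result = mul(result, f, p)
    return result

def artin_schreier(f, p):
    """Return f^p - f."""
    return add(power(f, p, p), neg(f, p), p)

def witness(p, s, sign):
    """Return the sum of t^(sign * p^i) for 0 <= i < s."""
    return {sign * p**i: 1 for i in range(s)}

def lies_on_curve(p, s):
    """Check that x = t^(p^s) satisfies both equations of V_p with the witnesses above."""
    x_minus_t = add({p**s: 1}, neg({1: 1}, p), p)
    xinv_minus_tinv = add({-(p**s): 1}, neg({-1: 1}, p), p)
    return (artin_schreier(witness(p, s, 1), p) == x_minus_t
            and artin_schreier(witness(p, s, -1), p) == xinv_minus_tinv)
```

For instance, with p = 3 and s = 2 the witness is $u = t + t^3$, and $u^3 - u = t^9 - t$ because the cross terms of the cube vanish modulo 3.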



Thinking of the analogy between function fields and number fields, one can ask
for the strict analogue of question 3.5 for global function fields. The answer to it is
positive:

Theorem 4.3. For any prime power q, $q = p^n$, p > 0, the polynomial ring
$(F_q[t], L_t)$ admits a diophantine model in the ring of rational functions $(F_q(t), L_t)$.

Proof. The proof is a bit indirect: we show that the polynomial ring has a diophan- 
tine model in the positive rational integers, and the latter has a diophantine model 
in the field of rational functions. 

More precisely, $F_q[t]$ is a recursive ring (cf. Rabin [20]), because $F_q$ is recursive
(since finite), and hence the same holds for the polynomial ring over $F_q$ (cf. Fröhlich
and Shepherdson [?]). So there exists an injective map $\theta : F_q[t] \to Z_{>0}$ such
that the graphs of addition and multiplication are recursive on $Z_{>0}$, and hence
$(\theta(F_q[t]), \theta(L_t))$ is a diophantine model of $(F_q[t], L_t)$ in $(Z_{>0}, L_Z)$.
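
To make the recursiveness concrete, here is a minimal sketch of one possible coding, under the simplifying assumption that q = p is prime (so that $F_q$ is just the integers modulo p). The particular map below, which reads the coefficients as base-p digits and shifts by one to land in the positive integers, is a hypothetical choice of ours; the proof only needs that some such computable injective map exists.

```python
P = 3  # assumption: q = p prime; coefficients are integers modulo P

def encode(coeffs):
    """Map a_0 + a_1*t + ... (as a coefficient list) to 1 + sum of a_i * P^i."""
    return 1 + sum((c % P) * P**i for i, c in enumerate(coeffs))

def decode(n):
    """Inverse of encode on positive integers; returns a list without trailing zeros."""
    n -= 1
    coeffs = []
    while n:
        n, c = divmod(n, P)
        coeffs.append(c)
    return coeffs

def add_codes(a, b):
    """Code of the sum of the polynomials coded by a and b."""
    f, g = decode(a), decode(b)
    size = max(len(f), len(g))
    f += [0] * (size - len(f))
    g += [0] * (size - len(g))
    return encode([(x + y) % P for x, y in zip(f, g)])

def mul_codes(a, b):
    """Code of the product of the polynomials coded by a and b."""
    f, g = decode(a), decode(b)
    prod = [0] * (len(f) + len(g))
    for i, x in enumerate(f):
        for j, y in enumerate(g):
            prod[i + j] = (prod[i + j] + x * y) % P
    return encode(prod)
```

Both operations are computed by decoding, operating on coefficients, and encoding again, so their graphs are decidable subsets of triples of positive integers; the zero polynomial receives the code 1.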

For the second step, we first recall a construction of Pheidas and Videla ([?],
[24]). Let v denote the t-valuation on $F_q(t)$, i.e., v(x) is the order of x at zero. For
any $k \in Z_{>0}$, let [k] denote the equivalence class of elements $x \in F_q(t)$ with v(x) =
k. For positive integers a and b, let $a \mid_p b$ denote the relation $(\exists n \in Z_{>0})(a = b p^n)$.

Consider the structure $S = (Z_{>0}, (+, \mid_p, 0, 1))$. Firstly, multiplication is
diophantine in S ([?], corollary on p. 529). Secondly, the set of equivalence classes [k] as
above is a model for S in which the relations in S can be defined by diophantine
formulas in $(F_q(t), L_t)$ between arbitrary representatives of the equivalence classes
in $F_q(t)$. We conclude that for arbitrary elements $x, y, z \in F_q(t)$ the relations
$[v(x)] = [v(y) + v(z)]$ and $[v(x)] = [v(y) \cdot v(z)]$ are diophantine in $F_q(t)$.

The problem with this encoding is that we do not know the existence of a
diophantine set in $F_q(t)$ which contains exactly one representative for each such
equivalence class. We fix this problem as follows. We know from the proof of theorem
4.2 that the set $D_p = \{ t^{p^k} \mid k \in Z_{>0} \}$ is diophantine in $(F_q(t), L_t)$, and this will be
our model.

To define addition and multiplication on elements of this set, we introduce the
following switching between $t^{p^k}$ and [k]: the set $\{(k, p^k) \mid k \in Z_{>0}\}$ is recursively
enumerable in $Z_{>0}$, so by Matijasevich's theorem, it is diophantine in $Z_{>0}$. Then,
by the aforementioned results, the set $\mathcal{E} = \{([k], [p^k]) \mid k \in Z_{>0}\}$ is diophantine
over $(F_q(t), L_t)$.

For the function symbols $R \in \{+, \cdot\}$ on $Z_{>0}$, we let the corresponding symbol $\tilde{R}$
for $x, y, z \in D_p$ be defined by

$$z = x \tilde{R} y \iff (\exists x_1, y_1, z_1 \in F_q(t))\bigl(((x_1, x), (y_1, y), (z_1, z)) \in \mathcal{E}^3 \wedge [x_1] R [y_1] = [z_1]\bigr).$$

For $R \in \{+, \cdot\}$, the right-hand side of the equivalence is diophantine in $(F_q(t), L_t)$
by what we have said before and the fact that for any two elements $w_1, w_2 \in F_q(t)$,
the statement $[w_1] = [w_2]$ is equivalent to $(v(w_1/w_2) \geq 0) \wedge (v(w_2/w_1) \geq 0)$,
which is diophantine by [?] and [24].

Finally, $(D_p, \tilde{+}, \tilde{\cdot}, t, t^p)$ is a diophantine model of $(Z, L_Z)$ in $(F_q(t), L_t)$. This
finishes the proof of the theorem. □

Of course, the above theorem still does not settle the following problem: 

Question 4.4. Is the polynomial ring $F_q[t]$ a diophantine subset of the field of
rational functions $F_q(t)$?